14 research outputs found

    Rice Galaxy: An open resource for plant science

    Background: Rice molecular genetics, breeding, genetic diversity, and allied research (such as rice-pathogen interaction) have adopted sequencing technologies and high-density genotyping platforms for genome variation analysis and gene discovery. Germplasm collections representing rice diversity, improved varieties, and elite breeding materials are accessible through rice gene banks for use in research and breeding, with many having genome sequences and high-density genotype data available. Combining phenotypic and genotypic information on these accessions enables genome-wide association analysis, which is driving quantitative trait loci discovery and molecular marker development. Comparative sequence analyses across quantitative trait loci regions facilitate the discovery of novel alleles. Analyses involving DNA sequences and large genotyping matrices for thousands of samples, however, pose a challenge to non-computer-savvy rice researchers. Findings: The Rice Galaxy resource has shared datasets that include high-density genotypes from the 3,000 Rice Genomes project and sequences with corresponding annotations from 9 published rice genomes. The Rice Galaxy web server and deployment installer include tools for designing single-nucleotide polymorphism assays, analyzing genome-wide association studies, population diversity, rice-bacterial pathogen diagnostics, and a suite of published genomic prediction methods. A prototype Rice Galaxy compliant with Open Access, Open Data, and Findable, Accessible, Interoperable, and Reproducible principles is also presented. Conclusions: Rice Galaxy is a freely available resource that empowers the plant research community to perform state-of-the-art analyses and utilize publicly available big datasets for both fundamental and applied science.

    Characterization of growth and metabolism of the haloalkaliphile Natronomonas pharaonis

    Natronomonas pharaonis is an archaeon adapted to two extreme conditions: high salt concentration and alkaline pH. It has become one of the model organisms for the study of extremophilic life. Here, we present a genome-scale, manually curated metabolic reconstruction for the microorganism. The reconstruction itself represents a knowledge base of the haloalkaliphile's metabolism and, as such, would greatly assist further investigations on archaeal pathways. In addition, we experimentally determined several parameters relevant to growth, including a characterization of the biomass composition and a quantification of carbon and oxygen consumption. Using the metabolic reconstruction and the experimental data, we formulated a constraints-based model which we used to analyze the behavior of the archaeon when grown on a single carbon source. Results of the analysis include the finding that Natronomonas pharaonis, when grown aerobically on acetate, uses a carbon-to-oxygen consumption ratio that is theoretically near-optimal with respect to growth and energy production. This supports the hypothesis that, under simple conditions, the microorganism optimizes its metabolism with respect to the two objectives. We also found that the archaeon has a very low carbon efficiency of only about 35%. This inefficiency is probably due to a very low P/O ratio as well as to the other difficulties posed by its extreme environment.

    Identification of candidate genes for drought stress tolerance in rice by the integration of a genetic (QTL) map with the rice genome physical map

    Genetic improvement for drought stress tolerance in rice involves the quantitative nature of the trait, which reflects the additive effects of several genetic loci throughout the genome. Yield components and related traits under stressed and well-watered conditions were assayed in mapping populations derived from crosses of Azucena×IR64 and Azucena×Bala. To find the candidate rice genes underlying Quantitative Trait Loci (QTL) in these populations, we conducted in silico analysis of a candidate region flanked by the genetic markers RM212 and RM319 on chromosome 1, proximal to the semi-dwarf (sd1) locus. A total of 175 annotated genes were identified from this region. These included 48 genes annotated by functional homology to known genes, 23 pseudogenes, 24 ab initio predicted genes supported by an alignment match to an expressed sequence tag (EST) of unknown function, and 80 hypothetical genes predicted solely by ab initio means. Among these, 16 candidate genes could potentially be involved in drought stress response.

    Discovery of genomic variants associated with genebank historical traits for rice improvement: SNP and indel data, phenotypic data, and GWAS results

    This dataset provides supporting information for Sanciangco et al. (submitted) consisting of: A) file list, tables of phenotypes for quantitative and categorical traits and trait descriptions, and tables of SNP/indel numbers for Filtered, LD-pruned and subpopulation datasets (7 files named as "00_*"); B) plink files for Filtered and LD-pruned SNP/indel datasets for all genotypes and for indica, japonica and aus subsets (15 files named as "01_*"); C) EMMAX results on Filtered dataset for 12 quantitative traits on All, Aus, Indica, and Japonica genotypes and corresponding Manhattan and QQ plots (144 files named as "0[2345]_*"); D) EMMAX results on LD-pruned dataset for 12 quantitative traits on All, Aus, Indica, and Japonica genotypes and corresponding Manhattan and QQ plots (72 files named as "0[6789]_*"); E) EMMAX results on LD-pruned dataset for 20 categorical traits treated as numeric on All genotypes and corresponding Manhattan and QQ plots (60 files named as "10_*"); F) ANOVA results obtained on numerically transformed LD-pruned dataset for 20 categorical traits on All genotypes and corresponding Manhattan plots (40 files named as "11_*").
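
The file-naming scheme in this description lends itself to a small sorting helper. The following is a minimal sketch, assuming the quoted name patterns can be read as shell-style globs; the function names and the example file names are hypothetical and are not part of the dataset.

```python
from fnmatch import fnmatchcase

# Section letters and name patterns copied from the dataset description.
# Treating the patterns as shell-style globs is an assumption of this sketch.
SECTION_PATTERNS = {
    "A": "00_*",       # file list, phenotype tables, SNP/indel counts
    "B": "01_*",       # plink files
    "C": "0[2345]_*",  # EMMAX on Filtered dataset, quantitative traits
    "D": "0[6789]_*",  # EMMAX on LD-pruned dataset, quantitative traits
    "E": "10_*",       # EMMAX on LD-pruned dataset, categorical traits
    "F": "11_*",       # ANOVA on LD-pruned dataset, categorical traits
}


def section_for(filename):
    """Return the section letter whose pattern matches filename, or None."""
    for section, pattern in SECTION_PATTERNS.items():
        if fnmatchcase(filename, pattern):
            return section
    return None


def group_by_section(filenames):
    """Group file names by section; unmatched names go under the key None."""
    groups = {}
    for name in filenames:
        groups.setdefault(section_for(name), []).append(name)
    return groups


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the patterns.
    example = ["00_file_list.txt", "03_trait.ps", "07_trait.qq.png", "notes.txt"]
    for section, names in group_by_section(example).items():
        print(section, names)
```

Comparing the size of each group against the counts stated in the description (7, 15, 144, 72, 60, and 40 files) would give a quick completeness check on a download.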

    SNP-Seek II: A resource for allele mining and analysis of big genomic data in Oryza sativa

    The 3000 Rice Genomes Project generated a large dataset of genomic variation for the world’s most important crop, Oryza sativa L. Using the Burrows-Wheeler Aligner (BWA) and the Genome Analysis Toolkit (GATK) variant calling on this dataset, we identified ∼40 M single-nucleotide polymorphisms (SNPs). Five reference genomes of rice representing the major variety groups were used: Nipponbare (temperate japonica), IR 64 (indica), 93-11 (indica), DJ 123 (aus), and Kasalath (aus). The results are accessible through the Rice SNP-Seek Database (http://snp-seek.irri.org) and through web services of the application programming interface (API). We incorporated legacy phenotypic and passport data for the sequenced varieties originating from the International Rice Genebank Collection Information System (IRGCIS) and gene models from several rice annotation projects. The massive genotypic data in SNP-Seek are stored using hierarchical data format 5 (HDF5) files for quick retrieval. Germplasm, phenotypic, and genomic data are stored in a relational database management system (RDBMS) using the Chado schema, allowing the use of controlled vocabularies from biological ontologies as query constraints in SNP-Seek. In this paper, we discuss the datasets stored in SNP-Seek, the architecture of the database and web application, and the interoperability methodologies in place, and present a few use cases demonstrating the utility of SNP-Seek for diversity analysis and molecular breeding.

    Rice SNP-seek database update: New SNPs, indels, and queries

    We describe updates to the Rice SNP-Seek Database since its first release. We ran a new SNP-calling pipeline followed by filtering that resulted in complete, base, filtered and core SNP datasets. Besides the Nipponbare reference genome, the pipeline was run on genome assemblies of IR 64, 93-11, DJ 123 and Kasalath. New genotype query and display features are added for reference assemblies, SNP datasets and indels. JBrowse now displays BAM, VCF and other annotation tracks, the additional genome assemblies and an embedded VISTA genome comparison viewer. Middleware is redesigned for improved performance by using a hybrid of HDF5 and RDBMS for genotype storage. Query modules for genotypes, varieties and genes are improved to handle various constraints. An integrated list manager allows the user to pass query parameters for further analysis. The SNP Annotator adds traits, ontology terms, effects and interactions to markers in a list. Webservice calls were implemented to access most data. These features enable seamless querying of SNP-Seek across various biological entities, a step toward semi-automated gene-trait association discovery. URL: http://snp-seek.irri.org